Smart Paper Notes

Borrowing from Similar Code: A Deep Learning NLP-Based Approach for Log Statement Automation

Sina Gholamian , Paul A. S. Ward

Category: Machine Learning

2021-12-02

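Software developers embed logging statements in source code as an imperative duty in modern software development, because log files are necessary for tracking down runtime system issues and troubleshooting system management tasks. However, the current logging process is mostly manual, so the proper placement and content of logging statements remain challenges. To overcome these challenges, methods that aim to automate log placement and predict its content, i.e., "where and what to log", are of high interest. We therefore focus on predicting the location (i.e., where) and description (i.e., what) of log statements by utilizing source code clones and natural language processing (NLP), as these approaches provide additional context and advantages for log prediction. Specifically, we guide our research with three research questions (RQs): (RQ1) How can similar code snippets, i.e., code clones, be leveraged for log statement prediction? (RQ2) How can the approach be extended to automate log statement descriptions? (RQ3) How effective are the proposed methods for log location and description prediction? To pursue our RQs, we perform an experimental study on seven open-source Java projects. We introduce an updated and improved log-aware code-clone detection method to predict the location of logging statements (RQ1). Then, we incorporate natural language processing (NLP) and deep learning methods to automate the prediction of log statement descriptions (RQ2). Our analysis shows that our hybrid NLP and code-clone detection approach (NLP CC'd) outperforms conventional clone detectors on average in finding log statement locations, and achieves 40.86% higher performance on BLEU and ROUGE scores for predicting the description of logging statements compared with prior research (RQ3).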

translated by Google Translate

A Comprehensive Survey of Logging in Software: From Logging Statements Automation to Log Mining and Analysis

Sina Gholamian , Paul A. S. Ward

Category: Machine Learning

2021-10-24

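Logs are widely used to record runtime information of software systems, such as the timestamp and importance of an event, the unique ID of the log source, and part of the state of a task's execution. The rich information in logs enables system developers (and operators) to monitor the runtime behavior of their systems, track down system problems, and analyze log data in production settings. However, prior research on utilizing logs is scattered, which limits the ability of new researchers to get up to speed quickly and hampers currently active researchers in advancing the field further. This paper therefore surveys the area and provides a systematic literature review and mapping of contemporary logging practices and of log statement mining and monitoring techniques and their applications, such as system failure detection and diagnosis. We study a large number of conference and journal papers that appeared in top-level peer-reviewed venues. In addition, we draw out high-level trends in ongoing research and categorize publications into subdivisions. Finally, based on our holistic observations during this survey, we provide a set of challenges and opportunities that can guide researchers in academia and industry in moving the field forward.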

AttEntropy: Segmenting Unknown Objects in Complex Scenes using the Spatial Attention Entropy of Semantic Segmentation Transformers

Krzysztof Lis , Matthias Rottmann , Sina Honari , Pascal Fua , Mathieu Salzmann

Category: Computer Vision

2022-12-29

Vision transformers have emerged as powerful tools for many computer vision tasks. It has been shown that their features and class tokens can be used for salient object segmentation. However, the properties of segmentation transformers remain largely unstudied. In this work we conduct an in-depth study of the spatial attentions of different backbone layers of semantic segmentation transformers and uncover interesting properties. The spatial attentions of a patch intersecting with an object tend to concentrate within the object, whereas the attentions of larger, more uniform image areas rather follow a diffusive behavior. In other words, vision transformers trained to segment a fixed set of object classes generalize to objects well beyond this set. We exploit this by extracting heatmaps that can be used to segment unknown objects within diverse backgrounds, such as obstacles in traffic scenes. Our method is training-free and its computational overhead negligible. We use off-the-shelf transformers trained for street-scene segmentation to process other scene types.
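The concentrated-versus-diffuse contrast described in this abstract can be quantified with the Shannon entropy of a patch's attention weights, as the paper's title suggests. The following is a minimal sketch under that assumption, not the authors' implementation; the function name `attention_entropy` and the example weights are hypothetical.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of one patch's attention distribution."""
    total = sum(weights)
    probs = [w / total for w in weights]  # normalize so the weights sum to 1
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical attention rows over four patches (made-up numbers).
concentrated = [0.97, 0.01, 0.01, 0.01]  # attention focused on one region
diffuse = [0.25, 0.25, 0.25, 0.25]       # attention spread uniformly

low = attention_entropy(concentrated)
high = attention_entropy(diffuse)
```

Under this sketch, a patch whose attention concentrates on a few locations yields a lower entropy than one whose attention is spread evenly, so a per-patch entropy map could serve as a heatmap of the kind the abstract mentions.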

Agent-based Modeling and Simulation of Human Muscle For Development of Software to Analyze the Human Gait

Sina Saadati , Mohammadreza Razzazi

Category: Artificial Intelligence

2022-12-24

In this research, we present an agent-based model of human muscle that can be used in the analysis of human movement. Because the model is designed based on the physiological structure of the muscle, the simulation calculations are natural, and human movement can be analyzed using reverse engineering methods. The model is also a suitable choice for modern prostheses, because it requires less computation than machine learning models such as artificial neural networks, which makes our algorithm battery-friendly. We also devise a method that calculates the intensity of human muscle during the gait cycle using a reverse engineering solution. The algorithm, called Boots, differs from some optimization methods in that it can compute the activities of both agonist and antagonist muscles in a joint. Consequently, with an agent-based model of human muscle and the Boots algorithm, we can develop software that calculates the nervous stimulation of human lower-body muscles based on angular displacement during the gait cycle, without using painful methods like electromyography. By developing the application as open-source software, we hope to help researchers and physicians working in medical and biomechanical fields.

Biomedical image analysis competitions: The state of current participation practice

Matthias Eisenmann , Annika Reinke , Vivienn Weru , Minu Dietlinde Tizabi , Fabian Isensee , Tim J. Adler , Patrick Godau , Veronika Cheplygina , Michal Kozubek , Sharib Ali

Categories: Computer Vision | Machine Learning

2022-12-16

The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Fairness in Contextual Resource Allocation Systems: Metrics and Incompatibility Results

Nathanael Jo , Bill Tang , Kathryn Dullerud , Sina Aghaei , Eric Rice , Phebe Vayanos

Category: Machine Learning

2022-12-04

We study critical systems that allocate scarce resources to satisfy basic needs, such as homeless services that provide housing. These systems often support communities disproportionately affected by systemic racial, gender, or other injustices, so it is crucial to design these systems with fairness considerations in mind. To address this problem, we propose a framework for evaluating fairness in contextual resource allocation systems that is inspired by fairness metrics in machine learning. This framework can be applied to evaluate the fairness properties of a historical policy, as well as to impose constraints in the design of new (counterfactual) allocation policies. Our work culminates with a set of incompatibility results that investigate the interplay between the different fairness metrics we propose. Notably, we demonstrate that: 1) fairness in allocation and fairness in outcomes are usually incompatible; 2) policies that prioritize based on a vulnerability score will usually result in unequal outcomes across groups, even if the score is perfectly calibrated; 3) policies using contextual information beyond what is needed to characterize baseline risk and treatment effects can be fairer in their outcomes than those using just baseline risk and treatment effects; and 4) policies using group status in addition to baseline risk and treatment effects are as fair as possible given all available information. Our framework can help guide the discussion among stakeholders in deciding which fairness metrics to impose when allocating scarce resources.

Visual Question Answering From Another Perspective: CLEVR Mental Rotation Tests

Christopher Beckham , Martin Weiss , Florian Golemo , Sina Honari , Derek Nowrouzezahrai , Christopher Pal

Categories: Machine Learning (Statistics) | Computer Vision | Machine Learning

2022-12-03

Different types of mental rotation tests have been used extensively in psychology to understand human visual reasoning and perception. Understanding what an object or visual scene would look like from another viewpoint is a challenging problem that is made even harder if it must be performed from a single image. We explore a controlled setting whereby questions are posed about the properties of a scene if that scene was observed from another viewpoint. To do this we have created a new version of the CLEVR dataset that we call CLEVR Mental Rotation Tests (CLEVR-MRT). Using CLEVR-MRT we examine standard methods, show how they fall short, then explore novel neural architectures that involve inferring volumetric representations of a scene. These volumes can be manipulated via camera-conditioned transformations to answer the question. We examine the efficacy of different model variants through rigorous ablations and demonstrate the efficacy of volumetric representations.

Universal Feature Selection Tool (UniFeat): An Open-Source Tool for Dimensionality Reduction

Sina Tabakhi , Parham Moradi

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-11-30

The Universal Feature Selection Tool (UniFeat) is an open-source tool developed entirely in Java for performing feature selection processes in various research areas. It provides a set of well-known and advanced feature selection methods within its significant auxiliary tools. This allows users to compare the performance of feature selection methods. Moreover, due to the open-source nature of UniFeat, researchers can use and modify it in their research, which facilitates the rapid development of new feature selection algorithms.

Multimodal Learning for Multi-Omics: A Survey

Sina Tabakhi , Mohammod Naimul Islam Suvon , Pegah Ahadian , Haiping Lu

Categories: Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-11-29

With advanced imaging, sequencing, and profiling technologies, multiple omics data become increasingly available and hold promises for many healthcare applications such as cancer diagnosis and treatment. Multimodal learning for integrative multi-omics analysis can help researchers and practitioners gain deep insights into human diseases and improve clinical decisions. However, several challenges are hindering the development in this area, including the availability of easily accessible open-source tools. This survey aims to provide an up-to-date overview of the data challenges, fusion approaches, datasets, and software tools from several new perspectives. We identify and investigate various omics data challenges that can help us understand the field better. We categorize fusion approaches comprehensively to cover existing methods in this area. We collect existing open-source tools to facilitate their broader utilization and development. We explore a broad range of omics data modalities and a list of accessible datasets. Finally, we summarize future directions that can potentially address existing gaps and answer the pressing need to advance multimodal learning for multi-omics data analysis.

RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation

Reza Rasti , Armin Biglari , Mohammad Rezapourian , Ziyun Yang , Sina Farsiu

Category: Computer Vision

2022-09-26

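Optical coherence tomography (OCT) helps ophthalmologists assess macular edema, fluid accumulation, and lesions at microscopic resolution. Quantification of retinal fluids is necessary for OCT-guided treatment management, which depends on a precise image segmentation step. Because manual analysis of retinal fluids is a time-consuming, subjective, and error-prone task, demand for fast and robust automatic solutions has increased. In this study, a new convolutional neural architecture named RetiFluidNet is proposed for multi-class retinal fluid segmentation. The model benefits from hierarchical representation learning of textural, contextual, and edge features using a new self-adaptive dual-attention (SDA) module, multiple self-adaptive attention-based skip connections (SASC), and a novel multi-scale deep self-supervision learning (DSL) scheme. The attention mechanism in the proposed SDA module enables the model to automatically extract deformation-aware representations at different levels, and the introduced SASC paths further consider spatial-channel interdependencies when concatenating encoder and decoder units, which improves representational capability. RetiFluidNet is also optimized using a joint loss function comprising a weighted version of Dice overlap and edge-based connectivity losses, where several hierarchical stages of multi-scale local losses are integrated into the optimization process. The model is validated on three publicly available datasets, RETOUCH, OPTIMA, and DUKE, and compared against several baselines. Experimental results on these datasets demonstrate the effectiveness of the proposed model in retinal OCT fluid segmentation and show that the suggested method is more effective than existing state-of-the-art fluid segmentation algorithms in adapting to retinal OCT scans recorded by various imaging instruments.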
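The abstract mentions a weighted version of Dice overlap as part of the joint loss. The following is a minimal sketch of a class-weighted soft Dice loss, not the authors' formulation: the per-class weighting scheme, the smoothing constant `eps`, and the function names are assumptions made for illustration, and the edge-based connectivity term is omitted.

```python
def soft_dice(pred, target, eps=1e-6):
    """Soft Dice coefficient between a predicted probability map and a binary mask.

    Both arguments are flat lists of equal length; eps avoids division by zero.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (denominator + eps)


def weighted_dice_loss(preds, targets, class_weights):
    """Weighted average of (1 - Dice) over classes (assumed weighting scheme)."""
    total_weight = sum(class_weights)
    loss = 0.0
    for pred, target, weight in zip(preds, targets, class_weights):
        loss += weight * (1.0 - soft_dice(pred, target))
    return loss / total_weight


# Hypothetical two-class example on a four-pixel image.
perfect = weighted_dice_loss(
    [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]],
    [[1, 0, 0, 1], [0, 1, 1, 0]],
    [0.7, 0.3],
)
disjoint = weighted_dice_loss(
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[0, 0, 0, 1], [0, 0, 1, 0]],
    [0.7, 0.3],
)
```

In this sketch a perfect prediction gives a loss close to 0 and a prediction with no overlap gives a loss close to 1; larger class weights make errors on that class count for more.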
